Results 1 - 20 of 40
1.
BMC Bioinformatics ; 25(1): 164, 2024 Apr 25.
Article in English | MEDLINE | ID: mdl-38664601

ABSTRACT

Multimodal integration combines information from different sources or modalities to gain a more comprehensive understanding of a phenomenon. The challenges in multi-omics data analysis lie in the complexity, high dimensionality, and heterogeneity of the data, which demand sophisticated computational tools and visualization methods for proper interpretation. In this paper, we propose a novel method, termed Orthogonal Multimodality Integration and Clustering (OMIC), for analyzing CITE-seq data. Our approach enables researchers to integrate multiple sources of information while accounting for the dependence among them. We demonstrate the effectiveness of our approach on CITE-seq data sets for cell clustering. Our results show that our approach outperforms existing methods in terms of accuracy, computational efficiency, and interpretability. We conclude that the proposed OMIC method provides a powerful tool for multimodal data analysis that greatly improves the feasibility and reliability of integrated data analysis.
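A minimal sketch of the general idea of orthogonal integration for CITE-seq: one modality (here the protein/ADT matrix) is decorrelated from the other (the RNA matrix) via least-squares residuals before joint clustering. The function and matrix names are illustrative assumptions, not the authors' OMIC implementation:

import numpy as np
from sklearn.cluster import KMeans

def orthogonal_integrate(rna, adt, n_clusters=10):
    # rna: cells x features (e.g. a PCA-reduced RNA matrix), adt: cells x proteins
    beta, *_ = np.linalg.lstsq(rna, adt, rcond=None)   # regress ADT on RNA
    adt_orth = adt - rna @ beta                        # keep the part of ADT not explained by RNA
    combined = np.hstack([rna, adt_orth])              # integrated, de-correlated feature matrix
    return KMeans(n_clusters=n_clusters, n_init=10).fit_predict(combined)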


Subject(s)
Single-Cell Analysis , Cluster Analysis , Single-Cell Analysis/methods , Computational Biology/methods , Humans , Algorithms
3.
Article in English | MEDLINE | ID: mdl-37426065

ABSTRACT

Single cell RNA sequencing (scRNA-seq) technologies provide researchers with an unprecedented opportunity to exploit cell heterogeneity. For example, the sequenced cells belong to various cell lineages, and stem and progenitor cells within those lineages may have different cell fates. These cells may differentiate into various mature cell types during a cell differentiation process. To trace the behavior of cell differentiation, researchers reconstruct cell lineages and predict cell fates by ordering cells chronologically into a trajectory with a pseudo-time. However, in scRNA-seq experiments, there are no cell-to-cell correspondences over time from which to reconstruct the cell lineages, which creates a significant challenge for cell lineage tracing and cell fate prediction. Therefore, methods that can accurately reconstruct dynamic cell lineages and predict cell fates are highly desirable. In this article, we develop an innovative machine-learning framework called Cell Smoothing Transformation (CellST) to elucidate dynamic cell fate paths and construct gene networks in cell differentiation processes. Unlike existing methods that construct one single bulk cell trajectory, CellST builds cell trajectories and tracks behaviors for each individual cell. Additionally, CellST can predict cell fates even for less frequent cell types. Based on the individual cell fate trajectories, CellST can further construct dynamic gene networks to model gene-gene relationships along the cell differentiation process and discover critical genes that potentially regulate cells into various mature cell types.
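For illustration only, the following kernel-smoothing transformation averages each cell's expression over its nearest neighbours, in the spirit of building per-cell rather than bulk trajectories; the neighbour count, bandwidth, and Gaussian weighting are assumptions, not the CellST specification:

import numpy as np
from sklearn.neighbors import NearestNeighbors

def smooth_cells(expr, k=30, bandwidth=1.0):
    # expr: cells x genes expression matrix (numpy array)
    nn = NearestNeighbors(n_neighbors=k).fit(expr)
    dist, idx = nn.kneighbors(expr)
    w = np.exp(-(dist / bandwidth) ** 2)          # Gaussian kernel weights
    w /= w.sum(axis=1, keepdims=True)
    # each cell becomes a weighted average of its neighbourhood
    return np.einsum('ck,ckg->cg', w, expr[idx])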

4.
Zhonghua Wei Zhong Bing Ji Jiu Yi Xue ; 35(6): 573-577, 2023 Jun.
Article in Chinese | MEDLINE | ID: mdl-37366121

ABSTRACT

OBJECTIVE: To investigate the correlation of hemoglobin (Hb) level with the prognosis of elderly patients diagnosed with sepsis. METHODS: A retrospective cohort study was conducted. Information on elderly patients with sepsis in the Medical Information Mart for Intensive Care-IV (MIMIC-IV), including basic information, blood pressure, routine blood test results [the Hb level of a patient was defined as his/her maximum Hb level from 6 hours before admission to the intensive care unit (ICU) to 24 hours after ICU admission], blood biochemical indexes, coagulation function, vital signs, severity scores, and outcome indicators, was extracted. Curves of Hb level versus 28-day mortality risk were developed using a restricted cubic spline model based on Cox regression analysis. The patients were divided into four groups (Hb < 100 g/L, 100 g/L ≤ Hb < 130 g/L, 130 g/L ≤ Hb < 150 g/L, Hb ≥ 150 g/L) based on these curves. The outcome indicators of patients in each group were analyzed, and 28-day Kaplan-Meier survival curves were drawn. Logistic regression and Cox regression models were used to analyze the relationship between Hb level and 28-day mortality risk in the different groups. RESULTS: A total of 7,473 elderly patients with sepsis were included. There was a U-shaped relationship between Hb levels within 24 hours after ICU admission and the risk of 28-day mortality in patients with sepsis. Patients with 100 g/L ≤ Hb < 130 g/L had a lower risk of 28-day mortality. When Hb level was less than 100 g/L, the risk of death decreased gradually with increasing Hb level. When Hb level was ≥ 130 g/L, the risk of death increased gradually with increasing Hb level. Multivariate Logistic regression analysis revealed that the mortality risks of patients with Hb < 100 g/L [odds ratio (OR) = 1.44, 95% confidence interval (95%CI) 1.23-1.70, P < 0.001] and Hb ≥ 150 g/L (OR = 1.77, 95%CI 1.26-2.49, P = 0.001) increased significantly in the model including all confounding factors; the mortality risk of patients with 130 g/L ≤ Hb < 150 g/L increased, but the difference was not statistically significant (OR = 1.21, 95%CI 0.99-1.48, P = 0.057). Multivariate Cox regression analysis suggested that the mortality risks of patients with Hb < 100 g/L [hazard ratio (HR) = 1.27, 95%CI 1.12-1.44, P < 0.001] and Hb ≥ 150 g/L (HR = 1.49, 95%CI 1.16-1.93, P = 0.002) increased significantly in the model including all confounding factors; the mortality risk of patients with 130 g/L ≤ Hb < 150 g/L increased, but the difference was not statistically significant (HR = 1.17, 95%CI 0.99-1.37, P = 0.053). The Kaplan-Meier survival curves showed that the 28-day survival rate of elderly septic patients in the 100 g/L ≤ Hb < 130 g/L group was significantly higher than that in the Hb < 100 g/L, 130 g/L ≤ Hb < 150 g/L, and Hb ≥ 150 g/L groups (85.26% vs. 77.33%, 79.81%, 74.33%; Log-Rank test: χ2 = 71.850, P < 0.001). CONCLUSIONS: Elderly patients with sepsis had a low mortality risk if their Hb level within 24 hours after ICU admission satisfied 100 g/L ≤ Hb < 130 g/L, and both higher and lower Hb levels were associated with increased mortality risks.
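A hedged sketch of the spline-in-Cox-model step described above, assuming a MIMIC-derived data frame with illustrative column names hb, time_28d, and death_28d; patsy supplies the restricted (natural) cubic spline basis and lifelines fits the Cox model:

import pandas as pd
from patsy import dmatrix
from lifelines import CoxPHFitter

def fit_hb_spline_cox(df):
    # df columns (assumed): hb (g/L), time_28d (days), death_28d (0/1)
    df = df.reset_index(drop=True)
    spline = dmatrix("cr(hb, df=4) - 1", df, return_type="dataframe")  # restricted cubic spline basis
    data = pd.concat([spline, df[["time_28d", "death_28d"]]], axis=1)
    cph = CoxPHFitter()
    cph.fit(data, duration_col="time_28d", event_col="death_28d")
    return cph  # cph.print_summary() shows hazard ratios for the spline terms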


Subject(s)
Sepsis , Humans , Male , Female , Aged , Retrospective Studies , Sepsis/diagnosis , Critical Care , Intensive Care Units , Prognosis , Hemoglobins , ROC Curve
5.
J Am Stat Assoc ; 118(541): 135-146, 2023.
Article in English | MEDLINE | ID: mdl-37346228

ABSTRACT

With rapid advances in information technology, massive datasets are collected in all fields of science, such as biology, chemistry, and social science. Useful or meaningful information is extracted from these data, often through statistical learning or model fitting. In massive datasets, both the sample size and the number of predictors can be large, in which case conventional methods face computational challenges. Recently, an innovative and effective sampling scheme based on leverage scores via singular value decompositions has been proposed to select rows of a design matrix as a surrogate of the full data in linear regression. Analogously, variable screening can be viewed as selecting columns of the design matrix. However, effective variable selection along this line of thinking remains elusive. In this article, we bridge this gap by proposing a weighted leverage variable screening method that utilizes both the left and right singular vectors of the design matrix. We show theoretically and empirically that the predictors selected by our method consistently include the true predictors not only for linear models but also for complicated general index models. Extensive simulation studies show that the weighted leverage screening method is highly computationally efficient and effective. We also demonstrate its success in identifying carcinoma-related genes using spatial transcriptome data.
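A minimal sketch of scoring predictors by weighted leverage computed from the SVD of the design matrix; the number of leading singular directions used and the singular-value weighting below are illustrative assumptions, not the paper's exact statistic:

import numpy as np

def weighted_leverage_screen(X, n_keep, k=20):
    # X: n x p design matrix; keep the n_keep highest-scoring predictors
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    w = s[:k] / s[:k].sum()                  # weights from the leading singular values (assumed form)
    scores = (Vt[:k].T ** 2) @ w             # one weighted column-leverage score per predictor
    return np.argsort(scores)[::-1][:n_keep]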

6.
Adv Sci (Weinh) ; 10(19): e2300049, 2023 Jul.
Article in English | MEDLINE | ID: mdl-36967571

ABSTRACT

Bubbles in air are ephemeral because of gravity-induced drainage and liquid evaporation, which severely limits their applications, especially as intriguing bio/chemical reactors. In this work, a new approach that uses acoustic levitation combined with controlled liquid compensation to stabilize bubbles is proposed. Owing to the suppression of drainage by the sound field and the prevention of capillary waves by liquid compensation, the bubbles can remain stable and intact indefinitely. The acoustically levitated bubble shows a significantly enhanced particle adsorption ability because of the oscillation of the bubble and the presence of internal acoustic streaming. The results shed light on the development of novel air-purification techniques that do not consume any solid filters.

7.
J Comput Graph Stat ; 31(3): 802-812, 2022.
Article in English | MEDLINE | ID: mdl-36407675

ABSTRACT

Smoothing splines have been used pervasively in nonparametric regression. However, the computational burden of smoothing splines is significant when the sample size n is large. When the number of predictors d ≥ 2, the computational cost of smoothing splines is of order O(n^3) using the standard approach. Many methods have been developed to approximate smoothing spline estimators by using q basis functions instead of n, resulting in a computational cost of order O(nq^2). These methods are called basis selection methods. Despite their algorithmic benefits, most basis selection methods require the assumption that the sample is uniformly distributed on a hypercube, and they may perform poorly when this assumption is not met. To overcome this obstacle, we develop an efficient algorithm that is adaptive to the unknown probability density function of the predictors. Theoretically, we show that the proposed estimator has the same convergence rate as the full-basis estimator when q is roughly of order O[n^{2d/{(pr+1)(d+2)}}], where p ∈ [1, 2] and r ≈ 4 are constants depending on the type of spline. Numerical studies on various synthetic datasets demonstrate the superior performance of the proposed estimator in comparison with mainstream competitors.
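To make the O(nq^2) idea concrete, here is a generic reduced-basis fit in one dimension: the spline-type estimator is represented with q kernel basis functions centred at selected points, so the normal equations involve an n x q matrix rather than an n x n one. The Gaussian kernel and ridge-style penalty are simplifying assumptions, not the paper's reproducing kernel or its adaptive selection rule:

import numpy as np

def reduced_basis_fit(x, y, centers, lam=1e-3, scale=1.0):
    # x, y: 1-D arrays of length n; centers: q selected basis locations
    K_nq = np.exp(-((x[:, None] - centers[None, :]) / scale) ** 2)       # n x q basis matrix
    K_qq = np.exp(-((centers[:, None] - centers[None, :]) / scale) ** 2) # q x q penalty matrix
    coef = np.linalg.solve(K_nq.T @ K_nq + lam * K_qq, K_nq.T @ y)       # O(n q^2) + O(q^3) work
    return lambda xnew: np.exp(-((xnew[:, None] - centers[None, :]) / scale) ** 2) @ coef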

8.
IEEE Internet Things J ; 9(15): 13862-13875, 2022 Aug 01.
Article in English | MEDLINE | ID: mdl-36712176

ABSTRACT

Rapid and accurate detection and localization of electronic disturbances are important for preventing their potential damage and determining potential remedies. Existing anomaly detection methods are severely limited by low accuracy, expensive computational cost, and the need for highly trained personnel. There is an urgent need for a scalable online algorithm for in-field analysis of large-scale power electronics networks. In this paper, we propose a fast and accurate algorithm for anomaly detection and localization in power electronics networks: the stratified colored-node graph (CONGO2). The algorithm hierarchically models the change of correlated waveforms and then of correlated sensors using the colored-node graph. By aggregating the change of each sensor with its neighbors' inputs, we can identify and localize anomalies that cannot be detected from data collected by a single sensor. Because our proposed method focuses only on changes within a short time frame, it is highly computationally efficient and requires only a small amount of data storage. Thus, our method is ideal for online and reliable anomaly detection and localization in large-scale power electronics networks. Compared to existing anomaly detection methods, our method is entirely data-driven without training data, highly accurate and reliable for wide-spectrum anomaly detection, and, more importantly, capable of both detection and localization. Thus, it is ideal for in-field deployment in large-scale power electronics networks. As illustrated with a 37-node distributed energy resources (DER) power grid, our method can effectively detect and localize various cyber and physical attacks.
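An illustrative sketch of the neighbour-aggregation idea: score each sensor by the change in its recent window and pool that score with its graph neighbours, flagging nodes whose pooled score exceeds a threshold. The z-score-style change statistic and the threshold are assumptions for illustration, not the CONGO2 statistic:

import numpy as np
import networkx as nx

def localize_anomaly(readings, graph, window=50, thresh=3.0):
    # readings: dict node -> 1-D array of recent waveform samples; graph: nx.Graph of sensors
    change = {v: abs(np.mean(x[-window:]) - np.mean(x[:-window])) / (np.std(x[:-window]) + 1e-9)
              for v, x in readings.items()}
    pooled = {v: change[v] + np.mean([change[u] for u in graph.neighbors(v)] or [0.0])
              for v in graph.nodes}
    return [v for v, s in pooled.items() if s > thresh]   # candidate anomalous nodes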

9.
J Phys Chem B ; 125(34): 9660-9667, 2021 09 02.
Article in English | MEDLINE | ID: mdl-34425052

ABSTRACT

Atomic force microscopy-single-molecule force spectroscopy (AFM-SMFS) is a powerful methodology for probing intermolecular and intramolecular interactions in biological systems because of its operability under physiological conditions, facile and rapid sample preparation, versatile molecular manipulation, and combined functionality with high-resolution imaging. Since a huge number of AFM-SMFS force-distance curves are collected to avoid human bias and errors and to save time, numerous algorithms have been developed to analyze AFM-SMFS curves. Nevertheless, there is still a need for new algorithms for the analysis of AFM-SMFS data, because current algorithms cannot assign an unbinding force to its corresponding binding site owing to the lack of networking functionality to model the relationships among the unbinding forces. To address this challenge, we develop an unsupervised method, a network-based automatic clustering algorithm (NASA), to decode the details of specific molecules, e.g., the unbinding force of each binding site, from AFM-SMFS curves. Using the interaction of heparan sulfate (HS) with antithrombin (AT) on different endothelial cell surfaces as a model system, we demonstrate that NASA is able to automatically detect peaks and calculate unbinding forces. More importantly, NASA successfully identifies three unbinding force clusters, which could correspond to three different binding sites, for both Ext1f/f and Ndst1f/f cell lines. NASA has great potential to be applied, either readily or with slight modification, to other AFM-based SMFS measurements that produce "saw-tooth"-shaped force-distance curves with jumps related to unbinding, such as antibody-antigen and DNA-protein interactions.
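A minimal sketch of the two steps named above: detect rupture events in each force-distance curve and cluster the resulting unbinding forces into groups, here with scipy peak detection and a Gaussian mixture. The prominence threshold (forces assumed in newtons), cluster count, and mixture model are illustrative stand-ins for NASA's network-based procedure:

import numpy as np
from scipy.signal import find_peaks
from sklearn.mixture import GaussianMixture

def unbinding_force_clusters(curves, n_clusters=3, prominence=20e-12):
    forces = []
    for f in curves:                                      # f: 1-D force trace along tip-sample distance
        peaks, _ = find_peaks(-f, prominence=prominence)  # ruptures appear as dips in the retraction force
        forces.extend(-f[peaks])                          # unbinding force magnitudes
    forces = np.array(forces).reshape(-1, 1)
    gmm = GaussianMixture(n_components=n_clusters).fit(forces)
    return gmm.means_.ravel(), gmm.predict(forces)        # cluster centres and per-event labels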


Subject(s)
Algorithms , Binding Sites , Cluster Analysis , Humans , Microscopy, Atomic Force , Spectrum Analysis
10.
Biometrika ; 108(1): 149-166, 2021 Mar.
Article in English | MEDLINE | ID: mdl-34294943

ABSTRACT

Large samples are generated routinely from various sources. Classic statistical models, such as smoothing spline ANOVA models, are not well equipped to analyse such large samples because of high computational costs. In particular, the daunting computational cost of selecting smoothing parameters renders smoothing spline ANOVA models impractical. In this article, we develop an asympirical, i.e., asymptotic and empirical, smoothing parameters selection method for smoothing spline ANOVA models in large samples. The idea of our approach is to use asymptotic analysis to show that the optimal smoothing parameter is a polynomial function of the sample size and an unknown constant. The unknown constant is then estimated through empirical subsample extrapolation. The proposed method significantly reduces the computational burden of selecting smoothing parameters in high-dimensional and large samples. We show that smoothing parameters chosen by the proposed method tend to the optimal smoothing parameters that minimize a specific risk function. In addition, the estimator based on the proposed smoothing parameters achieves the optimal convergence rate. Extensive simulation studies demonstrate the numerical advantage of the proposed method over competing methods in terms of relative efficacy and running time. In an application to molecular dynamics data containing nearly one million observations, the proposed method has the best prediction performance.
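A sketch of the subsample-extrapolation idea under the stated polynomial relationship between the optimal smoothing parameter and the sample size; the subsample sizes and the plug-in selector (e.g., a GCV-based routine supplied by the user) are illustrative assumptions:

import numpy as np

def extrapolate_lambda(x, y, select_lambda, subsizes=(500, 1000, 2000, 4000)):
    # select_lambda(x_sub, y_sub) -> smoothing parameter chosen on a subsample, e.g. by GCV
    rng = np.random.default_rng(0)
    lams = []
    for m in subsizes:
        idx = rng.choice(len(x), size=m, replace=False)
        lams.append(select_lambda(x[idx], y[idx]))
    slope, intercept = np.polyfit(np.log(subsizes), np.log(lams), 1)  # log(lambda) ~ log(n)
    return np.exp(intercept + slope * np.log(len(x)))                 # extrapolated lambda for the full sample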

11.
Sci Rep ; 11(1): 11432, 2021 06 01.
Article in English | MEDLINE | ID: mdl-34075074

ABSTRACT

Retinitis Pigmentosa (RP) is a mostly incurable inherited retinal degeneration affecting approximately 1 in 4000 individuals globally. The goal of this work was to identify drugs that can help patients suffering from the disease. To accomplish this, we screened drugs on a zebrafish autosomal dominant RP model. This model expresses a truncated human rhodopsin transgene (Q344X), causing significant rod degeneration by 7 days post-fertilization (dpf). Consequently, the larvae displayed a deficit in the visual motor response (VMR) under scotopic conditions. The diminished VMR was leveraged to screen an ENZO SCREEN-WELL REDOX library, since oxidative stress is postulated to play a role in RP progression. Our screening identified a beta-blocker, carvedilol, that ameliorated the deficient VMR of the RP larvae and increased their rod number. Carvedilol may act directly on rods, as it affected the adrenergic pathway in the photoreceptor-like human Y79 cell line. Since carvedilol is an FDA-approved drug, our findings suggest that it can potentially be repurposed to treat autosomal dominant RP patients.


Subject(s)
Animals, Genetically Modified , Behavior, Animal/drug effects , Genetic Diseases, Inborn , Retinitis Pigmentosa , Rhodopsin , Vision, Ocular , Zebrafish , Animals , Animals, Genetically Modified/genetics , Animals, Genetically Modified/metabolism , Cell Line , Drug Evaluation, Preclinical , Genetic Diseases, Inborn/drug therapy , Genetic Diseases, Inborn/genetics , Genetic Diseases, Inborn/metabolism , Humans , Mutation , Retinal Rod Photoreceptor Cells , Retinitis Pigmentosa/drug therapy , Retinitis Pigmentosa/genetics , Retinitis Pigmentosa/metabolism , Rhodopsin/genetics , Rhodopsin/metabolism , Transgenes , Vision, Ocular/drug effects , Vision, Ocular/immunology , Zebrafish/genetics , Zebrafish/metabolism
12.
BMC Med Inform Decis Mak ; 21(1): 187, 2021 06 11.
Article in English | MEDLINE | ID: mdl-34116660

ABSTRACT

BACKGROUND: Extensive clinical evidence suggests that preventive screening for coronary heart disease (CHD) at an earlier stage can greatly reduce the mortality rate. We use 64 two-dimensional speckle tracking echocardiography (2D-STE) features and seven clinical features to predict whether a person has CHD. METHODS: We develop a machine learning approach that integrates a number of popular classification methods by model stacking, and we generalize the traditional stacking method to a two-step stacking method to improve diagnostic performance. RESULTS: By borrowing strength from multiple classification models through the proposed method, we improve the CHD classification accuracy from around 70% to 87.7% on the testing set. The sensitivity of the proposed method is 0.903 and the specificity is 0.843, with an AUC of 0.904, which is significantly higher than those of the individual classification models. CONCLUSION: Our work lays a foundation for the deployment of speckle tracking echocardiography-based screening tools for coronary heart disease.
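A hedged sketch of stacking several classifiers for such a screening task; a plain one-level scikit-learn StackingClassifier is shown, whereas the paper generalizes this to a two-step stack, and the base-learner choices are placeholders:

from sklearn.ensemble import StackingClassifier, RandomForestClassifier, GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

def build_stack():
    # base learners produce out-of-fold probabilities; a logistic meta-learner combines them
    base = [("rf", RandomForestClassifier(n_estimators=200)),
            ("gb", GradientBoostingClassifier()),
            ("svm", SVC(probability=True))]
    return StackingClassifier(estimators=base,
                              final_estimator=LogisticRegression(),
                              stack_method="predict_proba", cv=5)

# usage (assumed feature matrix X and labels y): build_stack().fit(X_train, y_train).predict(X_test)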


Subject(s)
Coronary Disease , Echocardiography , Coronary Disease/diagnostic imaging , Humans , Machine Learning , Mass Screening , Reproducibility of Results , Risk Factors
13.
Microbiome ; 9(1): 57, 2021 02 26.
Article in English | MEDLINE | ID: mdl-33637135

ABSTRACT

BACKGROUND: Plants are naturally associated with root microbiota, microbial communities that influence host fitness. Thus, it is important to understand how plants control root microbiota. Epigenetic factors regulate the readouts of genetic information and consequently many essential biological processes. However, whether RNA-directed DNA methylation (RdDM) affects root microbiota assembly has remained elusive. RESULTS: By applying 16S rRNA gene sequencing, we investigated the root microbiota of Arabidopsis mutants defective in the canonical RdDM pathway, including dcl234, which harbors a triple mutation in the Dicer-like proteins DCL3, DCL2, and DCL4 that produce small RNAs for RdDM. Alpha diversity analysis showed reductions in microbial richness from the soil to roots, reflecting the selectivity of plants for root-associated bacteria. The dcl234 triple mutation significantly decreased the levels of Aeromonadaceae and Pseudomonadaceae, while it increased the abundance of many other bacterial families in the root microbiota. However, mutants of the other examined key players in the canonical RdDM pathway showed microbiota similar to Col-0, indicating that the DCL proteins affect root microbiota in an RdDM-independent manner. Subsequent gene analysis by shotgun sequencing of the root microbiome indicated a selective pressure on microbial resistance to plant defense in the dcl234 mutant. Consistent with the altered plant-microbe interactions, dcl234 displayed altered characteristics, including mRNA and sRNA transcriptomes that jointly highlighted altered cell wall organization and up-regulated defense, decreased cellulose and callose deposition in root xylem, and a restructured profile of root exudates that supported the alterations in gene expression and cell wall modifications. CONCLUSION: Our findings demonstrate an important role of the DCL proteins in influencing root microbiota through integrated regulation of plant defense, cell wall composition, and root exudates. Our results also demonstrate that the canonical RdDM pathway is dispensable for Arabidopsis root microbiota assembly. These findings not only establish a connection between root microbiota and plant epigenetic factors but also highlight the complexity of plant regulation of root microbiota.


Subject(s)
Arabidopsis/metabolism , Arabidopsis/microbiology , DNA Methylation/genetics , Microbiota , Plant Roots/microbiology , RNA, Plant , Ribonuclease III/metabolism , Arabidopsis/genetics , Gene Expression Regulation, Plant/genetics , Microbiota/genetics , Plant Roots/genetics , RNA, Ribosomal, 16S/genetics , Ribonuclease III/genetics
14.
Biometrika ; 107(3): 723-735, 2020 Sep.
Article in English | MEDLINE | ID: mdl-32831354

ABSTRACT

We consider the problem of approximating smoothing spline estimators in a nonparametric regression model. When applied to a sample of size n, the smoothing spline estimator can be expressed as a linear combination of n basis functions, requiring O(n^3) computational time when the number d of predictors is two or more. Such a sizeable computational cost hinders the broad applicability of smoothing splines. In practice, the full-sample smoothing spline estimator can be approximated by an estimator based on q randomly selected basis functions, resulting in a computational cost of O(nq^2). It is known that these two estimators converge at the same rate when q is of order [Formula: see text], where [Formula: see text] depends on the true function and [Formula: see text] depends on the type of spline. Such a q is called the essential number of basis functions. In this article, we develop a more efficient basis selection method. By selecting basis functions corresponding to approximately equally spaced observations, the proposed method chooses a set of basis functions with great diversity. The asymptotic analysis shows that the proposed smoothing spline estimator can decrease q to around [Formula: see text] when [Formula: see text]. Applications to synthetic and real-world datasets show that the proposed method leads to a smaller prediction error than other basis selection methods.
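A sketch of the selection rule described above for a one-dimensional predictor: pick q basis functions at observations that are approximately equally spaced, here via order statistics at evenly spaced ranks. Multivariate designs would need a space-filling analogue; this is an illustration, not the paper's algorithm:

import numpy as np

def equally_spaced_basis(x, q):
    # x: 1-D array of predictor values; returns indices of the selected basis observations
    order = np.argsort(x)
    picks = order[np.linspace(0, len(x) - 1, q).round().astype(int)]  # roughly equally spaced in x
    return np.unique(picks)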

15.
Adv Neural Inf Process Syst ; 33: 4015-4028, 2020 Dec.
Article in English | MEDLINE | ID: mdl-38737390

ABSTRACT

Sufficient dimension reduction is used pervasively as a supervised dimension reduction approach. Most existing sufficient dimension reduction methods are developed for data with a continuous response and may perform unsatisfactorily for categorical responses, especially binary responses. To address this issue, we propose a novel estimation method for the sufficient dimension reduction subspace (SDR subspace) using optimal transport. The proposed method, named principal optimal transport direction (POTD), estimates the basis of the SDR subspace using the principal directions of the optimal transport coupling between the data of different response categories. The proposed method also reveals the relationship among three seemingly unrelated topics: sufficient dimension reduction, support vector machines, and optimal transport. We study the asymptotic properties of POTD and show that, when the class labels contain no error, POTD estimates the SDR subspace exhaustively. Empirical studies show that POTD outperforms most state-of-the-art linear dimension reduction methods.
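An illustrative sketch of the main computation: compute the optimal-transport coupling between the two response classes (with the POT package) and take the leading principal directions of the induced displacements as an estimated SDR basis. The barycentric-displacement and centering choices are assumptions, not necessarily the POTD estimator as published:

import numpy as np
import ot

def potd_sketch(X0, X1, n_dirs=2):
    # X0, X1: samples (rows) from the two response classes, shape (n0, p) and (n1, p)
    a = np.full(len(X0), 1.0 / len(X0))
    b = np.full(len(X1), 1.0 / len(X1))
    M = ot.dist(X0, X1)                 # squared Euclidean cost matrix
    P = ot.emd(a, b, M)                 # optimal transport coupling
    disp = len(X0) * (P @ X1) - X0      # barycentric displacement of each class-0 point
    _, _, Vt = np.linalg.svd(disp, full_matrices=False)
    return Vt[:n_dirs].T                # p x n_dirs basis estimate of the SDR subspace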

16.
Article in English | MEDLINE | ID: mdl-38737400

ABSTRACT

Testing the hypothesis of parallelism is a fundamental statistical problem arising in many applied sciences. In this paper, we develop a nonparametric parallelism test for inferring whether the trends are parallel in the treatment and control groups. In particular, the proposed nonparametric parallelism test is a Wald-type test based on a smoothing spline ANOVA (SSANOVA) model, which can characterize complex patterns in the data. We show that the asymptotic null distribution of the test statistic is a chi-square distribution, unveiling a new version of the Wilks phenomenon. Notably, we establish the minimax sharp lower bound of the distinguishable rate for the nonparametric parallelism test using information theory, and we further prove that the proposed test is minimax optimal. Simulation studies are conducted to investigate the empirical performance of the proposed test. DNA methylation and neuroimaging studies are presented to illustrate potential applications of the test. The software is available at https://github.com/BioAlgs/Parallelism.
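A simplified, fully parametric stand-in for the hypothesis being tested: fit a regression-spline model with group-by-spline interaction terms and run a Wald test that all interaction coefficients are zero (parallel trends). The paper's SSANOVA-based nonparametric test is more general; this sketch, with assumed column names y, t, and group, only conveys the null hypothesis:

import numpy as np
import statsmodels.formula.api as smf

def parallelism_wald(df):
    # df columns (assumed): y (response), t (time/position), group (treatment vs control)
    fit = smf.ols("y ~ cr(t, df=5) * C(group)", data=df).fit()
    names = list(fit.params.index)
    R = np.zeros((sum(":" in n for n in names), len(names)))
    row = 0
    for j, n in enumerate(names):
        if ":" in n:                 # group-by-spline interaction coefficients
            R[row, j] = 1.0
            row += 1
    return fit.wald_test(R)          # H0: all interactions are zero, i.e. the trends are parallel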

17.
Proc Mach Learn Res ; 89: 2301-2311, 2019 Apr.
Article in English | MEDLINE | ID: mdl-31187096

ABSTRACT

Estimating the dependence structure of multidimensional time series data in real time is challenging. With large volumes of streaming data, the problem becomes more difficult when the multidimensional data are collected asynchronously across distributed nodes, which motivates us to sample representative data points from the streams. We propose a leverage score sampling (LSS) method for efficient online inference of the streaming vector autoregressive (VAR) model. We define the leverage score for the streaming VAR model so that the LSS method selects informative data points in real time with statistical guarantees of parameter estimation efficiency. Moreover, our LSS method can be deployed directly in an asynchronous decentralized environment, e.g., a sensor network without a fusion center, and produce asynchronous consensus online parameter estimation over time. By exploiting the temporal dependence structure of the VAR model, the LSS method selects samples independently for each dimension and is thus able to update the estimation asynchronously. We illustrate the effectiveness of the LSS method on synthetic, gas sensor, and seismic datasets.
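An offline illustration of leverage-score sampling for a VAR(1) model: form the lagged design, compute row leverage scores, and estimate the coefficients from a sampled subset. The streaming, decentralized, and inverse-probability-weighting aspects of the paper are omitted; this only shows the sampling ingredient:

import numpy as np

def lss_var1(X, n_keep):
    # X: (T x d) multivariate time series; Z holds lagged observations, Y the next-step values
    Z, Y = X[:-1], X[1:]
    Q, _ = np.linalg.qr(Z)
    lev = np.sum(Q ** 2, axis=1)                               # row leverage scores of the design
    idx = np.random.choice(len(Z), size=n_keep, replace=False, p=lev / lev.sum())
    A_hat, *_ = np.linalg.lstsq(Z[idx], Y[idx], rcond=None)    # coefficients from the sampled rows
    return A_hat.T                                             # VAR(1) transition matrix estimate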

18.
PLoS One ; 14(2): e0212234, 2019.
Article in English | MEDLINE | ID: mdl-30768618

ABSTRACT

Many contemporary neuroscience experiments utilize high-throughput approaches to simultaneously collect behavioural data from many animals. The resulting data are often complex in structure and are subject to systematic biases, which require new approaches for analysis and normalization. This study addressed the normalization need by establishing an approach based on linear-regression modeling. The model was established using a dataset of visual motor response (VMR) measurements obtained from several strains of wild-type (WT) zebrafish collected at multiple stages of development. The VMR is a locomotor response triggered by drastic light change, and is commonly measured repeatedly from multiple larvae arrayed in 96-well plates. This assay is subject to several systematic variations. For example, the light emitted by the machine varies slightly from well to well. In addition to the light-intensity variation, biological replication also creates batch-to-batch variation. These systematic variations may result in differences in the VMR and must be normalized. Our normalization approach explicitly modeled the effect of these systematic variations on the VMR. It also normalized the activity profiles of different conditions to a common baseline. Our approach is versatile, as it can incorporate different normalization needs as separate factors. This versatility was demonstrated by an integrated normalization of three factors: light-intensity variation, batch-to-batch variation, and baseline. After normalization, new biological insights were revealed from the data. For example, we found that larvae of the TL strain at 6 days post-fertilization (dpf) responded to light onset much more strongly than 9-dpf larvae, whereas previous analysis without normalization showed that their responses were relatively comparable. By removing systematic variations, our model-based normalization can facilitate downstream statistical comparisons and aid the detection of true biological differences in high-throughput studies of neurobehaviour.
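A sketch of regression-based normalization in the spirit described above: model activity with additive terms for light intensity and batch, then work with the residual-adjusted activity on a common baseline. The column names and the exact formula are placeholders, not the published model:

import statsmodels.formula.api as smf

def normalize_vmr(df):
    # df columns (assumed): activity, light_intensity, batch
    fit = smf.ols("activity ~ light_intensity + C(batch)", data=df).fit()
    out = df.copy()
    out["activity_norm"] = fit.resid + fit.params["Intercept"]   # remove well/batch effects, re-centre on a common baseline
    return out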


Subject(s)
Behavior, Animal/physiology , Databases, Factual , Motor Activity/physiology , Zebrafish/physiology , Animals
19.
Sci Data ; 4: 170182, 2017 12 12.
Article in English | MEDLINE | ID: mdl-29231925

ABSTRACT

Retinal degeneration often affects the whole retina even though the disease-causing gene is specifically expressed in the light-sensitive photoreceptors. The molecular basis of the retinal defect can potentially be determined by gene-expression profiling of the whole retina. In this study, we measured the gene-expression profile of retinas microdissected from a zebrafish pde6cw59 (pde6c) mutant. This retinal-degeneration model not only displays cone degeneration caused by a cone-specific mutation, but also other secondary cellular changes starting from 4 days post-fertilization (dpf). To capture the underlying molecular changes, we subjected pde6c and wild-type (WT) retinas at 5 dpf/120 hours post-fertilization (hpf) to RNA sequencing (RNA-Seq) on the Illumina HiSeq 2000 platform. We also validated the RNA-Seq results by reverse transcription quantitative polymerase chain reaction (RT-qPCR) of seven phototransduction genes. Our analyses indicate that the RNA-Seq dataset is of high quality and effectively captures the molecular changes in the whole pde6c retina. This dataset will facilitate the characterization of the molecular defects in the pde6c retina at the initial stage of retinal degeneration.


Subject(s)
Cyclic Nucleotide Phosphodiesterases, Type 6/genetics , Retina/metabolism , Retinal Degeneration/genetics , Zebrafish Proteins/genetics , Animals , Microarray Analysis , Transcriptome , Zebrafish
20.
Genome Biol ; 18(1): 187, 2017 10 03.
Article in English | MEDLINE | ID: mdl-28974263

ABSTRACT

A major goal of metagenomics is to identify and study the entire collection of microbial species in a set of targeted samples. We describe a statistical metagenomic algorithm that simultaneously identifies microbial species and estimates their abundances without using reference genomes. As a trade-off, we require multiple metagenomic samples, usually ≥10 samples, to get highly accurate binning results. Compared to reference-free methods based primarily on k-mer distributions or coverage information, the proposed approach achieves a higher species binning accuracy and is particularly powerful when sequencing coverage is low. We demonstrated the performance of this new method through both simulation and real metagenomic studies. The MetaGen software is available at https://github.com/BioAlgs/MetaGen .
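An illustrative sketch of why multiple samples help in reference-free binning: contigs from the same species share a co-abundance profile across samples, so clustering (log) coverage profiles over many samples groups them together. A Gaussian mixture is used here purely for illustration; MetaGen's actual statistical model is more involved:

import numpy as np
from sklearn.mixture import GaussianMixture

def bin_contigs(coverage, n_species):
    # coverage: (contigs x samples) mean read depth per contig in each sample (>= 10 samples recommended)
    profiles = np.log1p(coverage)
    profiles = profiles - profiles.mean(axis=1, keepdims=True)   # remove per-contig depth offset
    return GaussianMixture(n_components=n_species).fit_predict(profiles)   # bin label per contig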


Subject(s)
Metagenomics/methods , Bayes Theorem , Diabetes Mellitus, Type 2/microbiology , Humans , Inflammatory Bowel Diseases/microbiology , Obesity/microbiology , Software